DETAILED ACTION 

1. This action is responsive to communications: Amendment, filed 02/23/2006. 

2. Claims 1-31 are pending in the case. Claims 1, 10, 15, 21, 26, and 31 are 
independent claims. 

3. Independent claims 1 and 31 have been amended to overcome the rejections 
under 35 U.S.C. 112, second paragraph; therefore, the rejections of claims 1-8 and 31 under 35 
U.S.C. 112, second paragraph, are withdrawn. 

4. Applicants' amendments to the specification are accepted. 

5. Applicant's arguments, see Remarks, p. 12-22, filed 02/23/2006, with respect to 
the rejections of claims 10-30 under 35 USC 103 have been fully considered and are 
persuasive. Therefore, the previous rejections have been withdrawn. However, upon 
further consideration, a new ground(s) of rejection is made over Galai in view of 
DaCosta. 



Claim Rejections - 35 USC § 103 

6. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

7. Claims 1-31 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
Galai et al. (hereinafter "Galai"), PCT Application filed August 2002, International 
Publication Number WO 03/017023 A2, published February 2003, in view of 
DaCosta et al. (hereinafter "DaCosta"), U.S. Patent No. 6,665,658, issued 
December 2003. 

Application/Control Number: 10/672,248 
Art Unit: 2176 

Independent claim 1 recites: A method for crawling documents comprising: 
receiving a uniform resource locator (URL); 

receiving at least two different copies of a document associated with the 
URL; and 

determining whether a web site corresponding to the URL uses session 
identifiers based on a comparison of URLs that are within the document and that 
change between the at least two different copies of the document. 

Galai teaches a method of indexing dynamic web pages for a search engine. The 
search engine consists of a spider and repository (p. 4, l. 3-19). Galai teaches a 
method for normalizing the URL of a document to index substantially similar Web pages 
only once (p. 20, l. 10-20). Galai teaches comparing a Web page with a second 
retrieved web page with reduced parameters, i.e., any divisible subunit of the URL (p. 
20, l. 10-20). Galai teaches the comparison of URLs within the document where the 
Web page includes one or more links with the complete URL, as for a sessionID (p. 20, 
l. 21 - p. 21, l. 9), resulting in two web pages which are similar in content but not identical. 
Galai teaches detecting the change between the two different copies of the document 
(p. 21, l. 1-8; p. 27, l. 14 - p. 28, l. 21). 

While Galai teaches a comparison of URLs for redundant parameters, which 
would include session identifiers since session identifiers are parameters within a URL, 
Galai does not explicitly teach that the URLs are compared for the specific purpose of 
determining whether the web site uses session identifiers. However, DaCosta teaches 
a method for crawling documents in a dynamic website, with a database for storing and 
identifying session identifiers URLs, and an application program for controlling a 
software agent (Col. 4, l. 41 - Col. 5, l. 23). DaCosta teaches the analysis of URLs and 
headers containing cookie data to determine if a web site uses session IDs (Col. 6, l. 
21-40). It was notoriously well known in the art at the time of the invention that session 
data for a web site and/or document could be contained in either the URL string, or in a 
cookie. 

Both DaCosta and Galai are directed toward methods for crawling web 
documents and tracking state and session information for web documents. It would 
have been obvious to one of ordinary skill in the art at the time of the invention to 
combine the method of indexing web pages for a search engine by removing redundant 
pages by comparing URL parameters taught by Galai, with the means of identifying 
session identifiers by comparing URL data and cookie data taught by DaCosta, so that 
Galai would have the benefit of identifying session information for a web site whether 
the session information were contained in the URL string or in the cookie, in order to 
remove redundant pages from both configurations (URL string or cookie) of dynamic 
web sites. 

Regarding dependent claim 2, Galai teaches that the method of comparing 
URLs can be applied to any web page in a site (p. 4, l. 15-20). 

Regarding dependent claim 3, Galai teaches a method of normalizing the URL 
in order to index substantially similar web pages only once (p. 20, l. 10-23), i.e., 
comparison of the clean, or normalized, URL to a set of clean URLs that represent 
previously crawled URLs, since Galai teaches a process for detecting redundant 
parameters in URLs with the same structure, executed once per URL structure, and 
then applied to each URL with the same structure (p. 21, 
l. 21 - p. 22, l. 5). 
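
By way of illustration only, and not as a characterization of the claims or of the cited references, the comparison of a clean URL against a set of previously crawled clean URLs discussed above might be sketched as follows. The function names (`normalize_url`, `is_new`), the parameter name `sid`, and the choices to lowercase the host and sort the remaining parameters are hypothetical assumptions made for this sketch; none of them is taken from Galai or from the application.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def normalize_url(url, redundant_params):
    """Return a clean URL with the redundant query parameters removed.

    Assumption: redundant parameters (for example a session identifier)
    appear in the query string and are known by name.
    """
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in redundant_params
    ]
    kept.sort()  # assumption: parameter order does not affect the page
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def is_new(url, redundant_params, seen_clean_urls):
    """Record the clean URL and report whether it was not seen before."""
    clean = normalize_url(url, redundant_params)
    if clean in seen_clean_urls:
        return False
    seen_clean_urls.add(clean)
    return True
```

Under these assumptions, two URLs that differ only in the value of `sid` (or in the order of their parameters) reduce to the same clean URL, so only the first of them would be crawled.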

Regarding dependent claims 4-6, Galai teaches that the method of comparing 
URLs can be applied to any web page in a site (p. 4, l. 15-20). Galai teaches an 
automatic method of URL comparison to remove redundant parameters from pages (p. 
20, l. 10-20), which would include session IDs (p. 20, l. 21-23), where the rules are 
determined automatically by comparing the URLs for redundancy and normalizing them. 

Regarding dependent claim 7, Galai teaches a process for detecting 
redundant parameters in URLs with the same structure, executed once per URL 
structure, and then applied to each URL with the same 
structure (p. 21, l. 21 - p. 22, l. 5), which corresponds to receiving the URL as a URL from a 
previously crawled web document. 

Regarding dependent claim 8, Galai teaches crawling the URL when the URL 
is determined to not already have been crawled (p. 24, l. 9-15). 

Regarding dependent claim 9, Galai teaches comparing a portion of the URLs 
that change between the two copies of the document and determining similarity based 
on a predetermined value of the portion of the URLs that change (p. 27, l. 6 - p. 28, l. 21; 
especially p. 28, l. 11-21), since Galai teaches automatically determining the redundant 
parameter by URL comparison and then using that parameter as a basis of comparison 
to other URLs. 

Independent claim 10 recites: A method for identifying web sites that use 
session identifiers comprising: downloading at least two different copies of at least one 
document from a web site; extracting uniform resource locators (URLs) from the two 
different copies of the web document; comparing the extracted URLs of the two 
different copies of the document; and determining whether the web site uses session 
identifiers based on the comparison. 
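
For illustration only, the steps recited above (two copies of a document, extraction of URLs, comparison, determination) might be sketched as follows. This sketch is not taken from the application, Galai, or DaCosta. The function names (`extract_urls`, `compare_copies`, `site_uses_session_ids`), the `min_fraction` threshold, the positional pairing of links, and the assumption that a session identifier appears as a query parameter are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, parse_qsl


class _LinkCollector(HTMLParser):
    """Collects href values of anchor tags in document order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.urls.append(value)


def extract_urls(html):
    """Return the URLs found in the anchor tags of one copy of a document."""
    collector = _LinkCollector()
    collector.feed(html)
    return collector.urls


def compare_copies(urls_a, urls_b):
    """Return (names of changed parameters, number of link pairs that changed).

    Assumption: both copies list their links in the same order, so links are
    paired by position; pairs whose host or path differ are skipped.
    """
    changed_names = set()
    changed_links = 0
    for a, b in zip(urls_a, urls_b):
        pa, pb = urlsplit(a), urlsplit(b)
        if (pa.netloc, pa.path) != (pb.netloc, pb.path):
            continue
        qa, qb = dict(parse_qsl(pa.query)), dict(parse_qsl(pb.query))
        differing = {n for n in qa.keys() & qb.keys() if qa[n] != qb[n]}
        if differing:
            changed_links += 1
            changed_names |= differing
    return changed_names, changed_links


def site_uses_session_ids(copy_a, copy_b, min_fraction=0.5):
    """Guess whether a site embeds session identifiers in its links.

    Hypothetical heuristic: True when at least `min_fraction` of the paired
    links carry a parameter whose value changed between the two copies.
    """
    urls_a, urls_b = extract_urls(copy_a), extract_urls(copy_b)
    total = min(len(urls_a), len(urls_b))
    if total == 0:
        return False
    _, changed_links = compare_copies(urls_a, urls_b)
    return changed_links / total >= min_fraction
```

In this sketch the two copies are passed in as HTML strings, so the downloading step is left to the caller. A threshold is used rather than a single changed link because some links (for example rotating advertisements) may change between downloads for reasons unrelated to sessions; that design choice is likewise an assumption.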

Galai teaches a method of indexing dynamic web pages for a search engine. The 
search engine consists of a spider and repository (p. 4, l. 3-19). Galai teaches a 
method for normalizing the URL of a document to index substantially similar Web pages 
only once (p. 20, l. 10-20). Galai teaches comparing a Web page with a second 
retrieved web page with reduced parameters, i.e., any divisible subunit of the URL (p. 
20, l. 10-20). Galai teaches the comparison of URLs within the document where the 
Web page includes one or more links with the complete URL, as for a sessionID (p. 20, 
l. 21 - p. 21, l. 9), resulting in two web pages which are similar in content but not identical. 
Galai teaches detecting the change between the two different copies of the document 
(p. 21, l. 1-8; p. 27, l. 14 - p. 28, l. 21). 

While Galai teaches a comparison of URLs for redundant parameters, which 
would include session identifiers since session identifiers are parameters within a URL, 
Galai does not explicitly teach that the URLs are compared for the specific purpose of 
determining whether the web site uses session identifiers. However, DaCosta teaches 
a method for crawling documents in a dynamic website, with a database for storing and 
identifying session identifiers URLs, and an application program for controlling a 
software agent (Col. 4, l. 41 - Col. 5, l. 23). DaCosta teaches the analysis of URLs and 
headers containing cookie data to determine if a web site uses session IDs (Col. 6, l. 
21-40). It was notoriously well known in the art at the time of the invention that session 
data for a web site and/or document could be contained in either the URL string, or in a 
cookie. 

Both DaCosta and Galai are directed toward methods for crawling web 
documents and tracking state and session information for web documents. It would 
have been obvious to one of ordinary skill in the art at the time of the invention to 
combine the method of indexing web pages for a search engine by removing redundant 
pages by comparing URL parameters taught by Galai, with the means of identifying 
session identifiers by comparing URL data and cookie data taught by DaCosta, so that 
Galai would have the benefit of identifying session information for a web site whether 
the session information were contained in the URL string or in the cookie, in order to 
remove redundant pages from both configurations (URL string or cookie) of dynamic 
web sites. 

Regarding dependent claim 11, Galai teaches comparing a portion of the URLs 
that change between the two copies of the document and determining similarity based 
on a predetermined value of the portion of the URLs that change (p. 27, l. 6 - p. 28, l. 21; 
especially p. 28, l. 11-21), since Galai teaches automatically determining the redundant 
parameter by URL comparison and then using that parameter as a basis of comparison 
to other URLs. 

Regarding dependent claim 12, Galai teaches that the method of comparing 
URLs can be applied to any web page in a site (p. 4, l. 15-20). 

Regarding dependent claim 13, claim 13 reflects substantially similar subject 
matter as claimed in dependent claim 2, and is rejected along the same rationale. 

Regarding dependent claim 14, Galai teaches an automatic method of URL 
comparison to remove redundant parameters from pages (p. 20, l. 10-20), which would 
include session IDs (p. 20, l. 21-23), where the rules are generated automatically by the 
method of comparing the URLs for redundancy and normalizing them. 

Independent claim 15 recites: A device comprising: a spider component 
configured to crawl web documents associated with at least one web site; and a session 
identifier component configured to determine whether the web site uses session 
identifiers based on a comparison of a portion of uniform resource locators (URLs) that 
change between different copies of at least one web document downloaded from the 
web site. 

Galai teaches a method of indexing dynamic web pages for a search engine. The 
search engine consists of a spider and repository (p. 4, l. 3-19). Galai teaches a 
method for normalizing the URL of a document to index substantially similar Web pages 
only once (p. 20, l. 10-20). Galai teaches comparing a Web page with a second 
retrieved web page with reduced parameters, i.e., any divisible subunit of the URL (p. 
20, l. 10-20). Galai teaches the comparison of URLs within the document where the 
Web page includes one or more links with the complete URL, as for a sessionID (p. 20, 
l. 21 - p. 21, l. 9), resulting in two web pages which are similar in content but not identical. 
Galai teaches detecting the change between the two different copies of the document 
(p. 21, l. 1-8; p. 27, l. 14 - p. 28, l. 21). 

While Galai teaches a comparison of URLs for redundant parameters, which 
would include session identifiers since session identifiers are parameters within a URL, 
Galai does not explicitly teach that the URLs are compared for the specific purpose of 
determining whether the web site uses session identifiers. However, DaCosta teaches 
a method for crawling documents in a dynamic website, with a database for storing and 
identifying session identifiers URLs, and an application program for controlling a 
software agent (Col. 4, l. 41 - Col. 5, l. 23). DaCosta teaches the analysis of URLs and 
headers containing cookie data to determine if a web site uses session IDs (Col. 6, l. 
21-40). It was notoriously well known in the art at the time of the invention that session 
data for a web site and/or document could be contained in either the URL string, or in a 
cookie. 

Both DaCosta and Galai are directed toward methods for crawling web 
documents and tracking state and session information for web documents. It would 
have been obvious to one of ordinary skill in the art at the time of the invention to 
combine the method of indexing web pages for a search engine by removing redundant 
pages by comparing URL parameters taught by Galai, with the means of identifying 
session identifiers by comparing URL data and cookie data taught by DaCosta, so that 
Galai would have the benefit of identifying session information for a web site whether 
the session information were contained in the URL string or in the cookie, in order to 
remove redundant pages from both configurations (URL string or cookie) of dynamic 
web sites. 

Regarding dependent claims 16-17, Galai teaches a spider to download content 
from a network and a component of the autonomous software search program to extract 
URLs from the downloaded content (p. 26, l. 19 - p. 27, l. 5), which corresponds to a fetch component 
configured to download content from a network and a content manager configured to 
extract URLs from the downloaded content. 

Regarding dependent claim 18, claim 18 reflects substantially similar subject 
matter as claimed in dependent claim 2, and is rejected along the same rationale. 

Regarding dependent claim 19, Galai teaches that the method of comparing 
URLs can be applied to any web page in a site (p. 4, l. 15-20). 

Regarding dependent claim 20, Galai teaches an automatic method of URL 
comparison to remove redundant parameters from pages (p. 20, l. 10-20), which would 
include session IDs (p. 20, l. 21-23), where the rules are generated automatically by 
comparing the URLs for redundancy and normalizing them. 

Regarding independent claim 21, claim 21 is directed toward the device used 
for implementing the method as claimed in claim 10, and is rejected along the same 
rationale. 

Regarding dependent claims 22 and 23, claims 22 and 23 reflect substantially 
similar subject matter as claimed in dependent claims 11 and 12, and are rejected along 
the same rationale. 

Regarding dependent claim 24, claim 24 reflects substantially similar subject 
matter as claimed in dependent claim 2, and is rejected along the same rationale. 

Regarding dependent claim 25, claim 25 reflects substantially similar subject 
matter as claimed in dependent claim 14, and is rejected along the same rationale. 

Regarding independent claim 26, claim 26 is directed toward the 
computer-readable medium containing programming instructions for executing the method as 
claimed in claim 10, and is rejected along the same rationale. 

Regarding dependent claims 27 and 28, claims 27 and 28 reflect substantially 
similar subject matter as claimed in dependent claims 11 and 12, and are rejected along 
the same rationale. 

Regarding dependent claim 29, claim 29 reflects substantially similar subject 
matter as claimed in dependent claim 2, and is rejected along the same rationale. 

Regarding dependent claim 30, claim 30 reflects substantially similar subject 
matter as claimed in dependent claim 14, and is rejected along the same rationale. 

Independent claim 31 recites: A method for crawling documents comprising: 
receiving a uniform resource locator (URL), and determining whether the URL is 
associated with a web site that uses session identifiers based on a comparison of 
content between different duplicate or near-duplicate copies of a document downloaded 
from the web site. 

Galai teaches a method of indexing dynamic web pages for a search engine. The 
search engine consists of a spider and repository (p. 4, l. 3-19). Galai teaches a 
method for normalizing the URL of a document to index substantially similar Web pages 
only once (p. 20, l. 10-20). Galai teaches comparing a Web page with a second 
retrieved web page with reduced parameters, i.e., any divisible subunit of the URL (p. 
20, l. 10-20). Galai teaches the comparison of URLs within the document where the 
Web page includes one or more links with the complete URL, as for a sessionID (p. 20, 
l. 21 - p. 21, l. 9), resulting in two web pages which are similar in content but not identical. 
Galai teaches detecting the change between the two different copies of the document 
(p. 21, l. 1-8; p. 27, l. 14 - p. 28, l. 21). Galai teaches determining whether session 
identifiers are used based on a comparison of content between different duplicate or 
near-duplicate copies of a document (p. 27, l. 18 - p. 28, l. 11). 

While Galai teaches a comparison of URLs for redundant parameters, which 
would include session identifiers since session identifiers are parameters within a URL, 
Galai does not explicitly teach that the URLs are compared for the specific purpose of 
determining whether the web site uses session identifiers. However, DaCosta teaches 
a method for crawling documents in a dynamic website, with a database for storing and 
identifying session identifiers URLs, and an application program for controlling a 
software agent (Col. 4, l. 41 - Col. 5, l. 23). DaCosta teaches the analysis of URLs and 
headers containing cookie data to determine if a web site uses session IDs (Col. 6, l. 
21-40). It was notoriously well known in the art at the time of the invention that session 
data for a web site and/or document could be contained in either the URL string, or in a 
cookie. 

Both DaCosta and Galai are directed toward methods for crawling web 
documents and tracking state and session information for web documents. It would 
have been obvious to one of ordinary skill in the art at the time of the invention to 
combine the method of indexing web pages for a search engine by removing redundant 
pages by comparing URL parameters taught by Galai, with the means of identifying 
session identifiers by comparing URL data and cookie data taught by DaCosta, so that 
Galai would have the benefit of identifying session information for a web site whether 
the session information were contained in the URL string or in the cookie, in order to 
remove redundant pages from both configurations (URL string or cookie) of dynamic 
web sites. 

Response to Arguments 

8. Applicant's arguments with respect to claims 1-9 and 31 have been considered 
but are moot in view of the new ground(s) of rejection. The new grounds of rejection 
include the Galai reference, which is being relied upon to teach the newly claimed 
limitations. 



Conclusion 

9. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Miller, et al., "SPHINX: A Framework for Creating Personal, Site-Specific Web 
Crawlers", Proceedings of the Seventh International World Wide Web Conference, 
Brisbane, Australia, April 1998, p. 1-13. 

Seda, C., "Making Dynamic and E-Commerce Sites Search Engine Friendly", Search 
Engine Watch, http://searchenginewatch.com/searchday/article.php/2161081, published 
October 29, 2002, p. 1-5. 

Quigo, Inc., "Quigo, Inc. Unveils Deep Web Search Technologies", press release dated 
August 15, 2001, p. 1-2. 

Search Tools Consulting, www.searchtools.com, "Generating Simple URLs for Search 
Engines", July 2003, p. 1-6. 

Sun Microsystems, The Java Tutorial, including sections on "Session Tracking", 
"Working with URLs", and "Parsing a URL", p. 1-9. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Amelia Rutledge whose telephone number is 
571-272-7508. The examiner can normally be reached on Monday - Friday 9:30 - 6:00. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Heather Herndon, can be reached on 571-272-4136. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). 